perm filename APP2[AIM,DBL]3 blob
sn#125921 filedate 1974-10-22 generic text, type T, neo UTF8
00100 .DEVICE XGP
00200 .FONT 1 "FIX25"
00300 .FONT 2 "SIGN57"
00400 .FONT 3 "SHD40"
00500 .FONT 4 "BDI25"
00600 .FONT 5 "NGB30"
00700 .FONT 6 "NGR20"
00800 .TURN ON "↓_π{"
00900 .TURN ON "⊗" FOR "%"
01000 .MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
01100 .MACRO E ⊂ APART END ⊃
01200 .TABBREAK
01300 .COMPACT
01400 .EVERY FOOTING(⊗6Fourth Draft .... {DATE},page A2.{IF PAGE = 1 THEN 1 ELSE PAGE},BEINGs' knowledge⊗*)
01500 .EVERY HEADING(⊗3BEINGS⊗*,,⊗4Doug Lenat⊗*)
01600 .COUNT PAGE PRINTING "1"
01700 .NEXT PAGE
00100 .SELECT 1
00200
00300 ⊗2APPENDIX 2. ⊗* ⊗3THE BEINGS⊗*
00400
00500 Here we present summaries of the knowledge embedded in each BEING.
00600 Only the most important parts of each BEING are even mentioned.
00700 First we present those BEINGS used to write CF. Next come the ones we
00800 had to add to the pool to get PUP6 to write GI and INF. Among the
00900 first group, we further subdivide it into (a) high-level,
01000 domain-specific, (b) low-level domain-specific, (c) ubiquitous,
01100 domain-independent, (d) high-level, domain-independent, (e)
01200 low-level, domain-independent, (f) non-BEING knowledge in variables,
01300 and (g) a few interesting demons and functions. The additions
01400 following CF are so small we don't subdivide their descriptions.
01500
01600 .SELECT 5
01700
01800 (i) The knowledge necessary to write a concept formation program:
01900
02000 A. High-level, domain-specific knowledge
02100
02200 .SELECT 1
02300
02400 ⊗4CONCEPT-FORMATION⊗*.
02450 Each interesting BEING part is translated into English. Just this once,
02460 the name of the part will be (parenthesized) preceding the translation.
02470 (pre-requisite:)
02475 The user must be aware that we are about to
02500 undertake concept formation.
02510 (demons:) Inference and attention-focussing demons
02600 must be turned on.
02610 (effects:) After completion of this task, PUP6 will be able
02700 to learn concepts.
02710 (generalizations:) This is a specialized form of attending, learning,
02800 and doing inductive inference.
02810 (alternatives:) It is an alternative to grammatical
02900 inference, pattern recognition, and simulated evolution.
02910 (specializations:) Its
03000 structure must be one of the following: classificatory concept
03100 formation, comparative concept formation, or metrical concept
03200 formation. We must make the boolean decision as to whether or not
03300 concepts may vary with time. Similarly, whether the speed of
03400 presentation of the stimuli is relevant; if so, then we must
03500 constrain the effort spent on various phases of identification.
03600 Instances may be left in view indefinitely, or may be removed right
03700 after processing; this latter case holds for CF; it means we must
03800 derive all relations (features) as soon as we see a scene. The
03900 program will have to be just complex enough to handle conjunctive,
04000 disjunctive, or both kinds of concepts; this is another decision to
04100 make. Similarly for positive, negative, both, or neither kinds of
04200 transfer (psychological), which affects the recognition that a
04300 concept is new, and how previously learned concepts interact with the
04400 learning of new ones. We must decide whether to use positive,
04500 negative, or both kinds of instances of a concept. Subject-specific
04600 behavior may be required; that is, the program may have to input a
04700 vector describing a particular individual, and its whole structure
04800 must mimic this subject. The last decision is one of adapting the
04900 program to an extended sample dialogue which the user must furnish;
05000 this will help both to check out the program writtten, and to fix
05100 various print statements and I/O formats.
05110 (complexity-vector:) It is easy to call this
05200 BEING (.1), it has a 50-50 chance of calling* itself, it has only a
05300 0.5 chance of succeeding, but the effort to try it is moderate (.5).
05400 There is no fundamental reason for delaying its investigation (.1).
05500 (iden:) It recognizes
05510 itself only through exact matching of one of seven
05600 patterns. (what, how, why:)
05610 It has sentences describing what it does, how, and why. (when:) It
05700 is unlikely (-70) to be called if some type of concept formation is
05800 already doable by PUP6; if PUP6 wants to characterize classes, then
05900 it's very likely (88) to be called. The presence of new information
06000 delays (-60) our calling the BEING, since it might affect what we
06100 should do. Conversely, the absence of new information mildly (40)
06200 encourages us to go on and try it.
06210 (post:requisites:) When finished, the user must be
06300 aware that PUP6 has decided on a particular type of concept
06400 formation, and that it has done it. (affects:)
06410 The other BEINGs affected depend
06500 on the decisions mentioned earlier.
06600
06700 This is an over-abundance of information. From now on, I will
06800 only give the little pieces of information which are crucial to the
06900 BEING; its essence. Also, the BEING part names won't be explicitly
06910 provided.
07000
07100 ⊗4CLASSIFICATORY CONCEPT FORMATION⊗*. To do this, we must partition a
07200 domain, in an interactive "guessing" manner.
07300
07400 ⊗4COMPARITIVE CONCEPT FORMATION⊗*. Same as above, but then we must
07500 partially order the equivalence classes of the partition. It is
07600 harder, also.
07700
07800 ⊗4METRICAL CONCEPT FORMATION⊗*. Same as previous BEING, but we must
07900 also induce a metric on the partial ordering of the classes. This is
08000 even more complicated.
08100
08200 Since we actually do classificatory CF, the BEINGs to order a
08300 partition and to metrize an ordering weren't implemented.
08400
08500 ⊗4PARTITION A DOMAIN⊗* in a guessing, interactive manner. The
08600 partition may be only partial, it may be only weak, and, most
08700 crucially, the BEING must be able to do some of these: partition by
08800 accepting an element and getting its class name (guessing the name
08900 and then checking it somehow), partition by accepting a class name
09000 and getting its member elements, partition by accepting ordered pairs
09100 <element, classname>. The fringe of conciousness demon must be
09200 activated from now on.
09300
09400 ⊗4PARTITION BY TAKE ELEMENT GET CLASS⊗*. Take hold of an element (by
09500 reading, for example), and then work to get the name of the class it
09600 belongs to (by guessing, then verifying the guess, for example). Then
09700 modify the structure of the class(es) involved.
09800
09900 ⊗4PARTITION BY TAKE CLASS GET ELEMENT⊗*. Take hld of a class name,
10000 and then work to get elements it contains. Then modify the structure
10100 of the class and the element(s) involved.
10200
10300 ⊗4PARTITION BY TAKE ELEMENT AND CLASS⊗* simultaneously. Taake hold of
10400 both an element and its corresponding class name, and use these to
10500 modify the structure of the partition; i.e., modify the class
10600 mentioned if the partition is stored by classes.
10700
10800
10900 ⊗5B. Low-level, domain-specific knowledge⊗*
11000
11100 ⊗4RECOGNIZE CONTRADICTION⊗*. It can translate "... is incompatible
11200 with ...". It is a predicate, fairly easy to write therefore. It is
11300 composed of SOMEOF the following: Probability≡0 contradiction,
11400 Probability≡1 contradiction, Probability >0 and <1 contradiction.
11500 Since these names are fairly cryptic, some of their parts (e.g, HOW)
11600 must be printed out to help the user choose (whenever they are
11700 involved, if the user asks for ramifications.)
11800
11900 ⊗4PROBABILITY ≡ 0 CONTRADICITON⊗*. Since this is a very simple thing
12000 in our domain of concept formation, it is immediately encodable as
12100 (MEMBER arg1 arg2). That is, if arg1 has probability zero of
12200 occurring in arg2, yet it does, then we have a contradiction.
12300
12400 ⊗4PROBABILITY ≡ 1 CONTRADICTION⊗*. Immediately encodable as (NOT
12500 (MEMBER arg1 arg2)). If arg1 has probability one of occurring in
12600 arg2, yet it doesn't, then we have a contradiction.
12700
12800 ⊗4PROBABILITY >0 AND <1 CONTRADICTION⊗*. Immediately enacodable as
12900 NIL. If arg1 might and might not occur in arg2, we can't get a
13000 contradiction just by checking its membership. Of course, the idea
13100 that this is the only way to prove contradiction is what makes these
13200 BEINGs domain-specific!
13300
13400 ⊗*.SCENE⊗*. This is a data structure, composed of four subparts. The
13500 first is a set O of objects. Next is an atom indicating the name N of
13600 the scene. Next come two lists of features, where a feature is just a
13700 predicate and its arguments. The first is the static relations S
13800 between objects. Finally we have the dynamic relations D between
13900 objects.
14000
14100 ⊗5C. Ubiquitous, problem-independent BEINGs and functions⊗*
14200
14300 ⊗4CHOOSE FROM⊗*. All its arguments must be BEINGs, else it prints a
14400 nasty warning message. We select the best BEING and apply it. If it
14500 fails, we re-order the remaining BEINGs and apply the best one of
14600 them. Note that this new reordering may use knowledge gleaned during
14700 the earlier, unsuccessful BEING try. Thus, this is a (possible)
14800 intelligent nondeterministic branch point. The intelligence lies (or
14900 fails to lie) in the comparison function, BETTER, which decides who
15000 goes next.
15100
15200 ⊗4SATISFY⊗*. This is the equivalent of a pattern-directed goal
15300 statement. We ask each BEING, "Can you do anything matching x?". We
15400 take the list of those answering affirmatively, and CHOOSE FROM it
15500 one BEING after another until the desired effects are realized.
15600 Notice that a BEING who said "probably" may succeed in his
15700 application and yet not effect the result we wanted, so that a
15800 trivial call on CHOOSE FROM is insufficient. The BEINGs possibly
15900 affected are just those answering affirmatively.
16000
16100 ⊗4MESSAGE⊗*. This BEING has a main effect (AWARE USER x), hence is
16200 very frequently called. The forgetful-user demon trims the aware
16300 user list periodically. Message looks to see if its message is on
16400 that list; if not, it inserts it and prints it out to the user. If it
16500 is, it moves the message to the front of the aware-user list and
16600 prints out nothing. This is an example of a specialized, fixed
16700 assertion-list, as described earlier.
16800
16900 ⊗4DETERMINE ARG VALUES⊗*. These functions take as input the name of a
17000 function, and output a description of what arguments it will ever be
17100 called with (in the existing code.) For example, it might say "arg1
17200 will always be NAME:OF:CLASS, and arg2 will consecutively be all
17300 integers from 3 to (LENGTH SET:OF:CLASSES)". At present these work in
17400 the obvious way, looking at everything. The tremendous amount of CPU
17500 time spent in these functions indicates that I should have made
17600 special assert:lists for argument instantiations, and updated them
17700 each time a BEING is called in the target code.
17800
17900 ⊗4FLOW-PRECEDED⊗*. This BEING must search through he code to find a
18000 form matching a given pattern. Although it is used under ten times
18100 in the total dialog, it is so costly that I've implemented it as an
18200 ask-the-user call. Work must be done here to understand why this is
18300 so inefficient, and to remedy it.
18400
18500 ⊗4FIND AND TAG⊗*. This BEING is similar to flow-preceded, except that
18600 the pattern-matching is between two constant strings. This is
18700 tolerably efficient in CPU time and is used heavily throughout the
18800 writing of CF.
18900
19000 ⊗4TRANSLATE⊗*. The natural language front end is managed by this
19100 BEING. It asks each BEING whether it recognizes a given string.
19200 Translate then takes the "best" -- the most probable -- of these as
19300 the translation, and can backtrack and reorder the remaining
19400 interpretations if it has to. If called with no argument, it examines
19500 various assert-lists to see if it can do any good. The idiom demon
19600 must be activated during the control period of this BEING.
19700
19800 ⊗4REINVESTIGATE DECISION⊗*. This is usaully called by a demon who
19900 watches the deferred-decision assert list. We transfer the decision
20000 in question from the deferred to the undeferred decision assert list.
20100 A deferral demon will promptly react to anything on this latter list.
20200 An interesting caution: it was necessary to inhibit all demons during
20300 the execution of this BEING, for reasons very similar to those
20400 leading to lock and unlock system commands. The fact that some BEING
20500 might have to be demon-uninterruptable forced us to institute an
20600 entire new question asking just about this tiny point!
20700
20800 ⊗4DEFER DECISION⊗*. Remove the decision from the undeferred decision
20900 assert list. Determine the situation when we must next reinvestigate
21000 this decision. This will be some predicate examing the state of the
21100 world. If this predicate is true currently, we must resolve the
21200 decision now. Otherwise, we put the decision on the deferred decision
21300 list, attached to its (new) reinvestigation predicate. Demons must be
21400 inhibited during this BEING's reign, to ensure that its notions of
21500 the world are accurate upon exit.
21600
21700 ⊗4WHEN NEXT⊗*. Manipulate the decision to extract the name of the
21800 variable holding information relevant to deferring the decision.
21900 Utilize this knowledge so as to keep the effects of the decision
22000 irrelevant; i.e., find the (next) situation in which they are not
22100 irrelevant. Whoever called this BEING is now asserted to be aware of
22200 its results. This is fairly complex, and the BEING is not called
22300 unless it is necessary. As it happens, it is called a few times for
22400 every decision to be made about every BEING in the target program
22500 (several hundred times).
22600
22700 ⊗4UTILIZE⊗*. This BEING applies various knowledge "variables,"
22800 starting at specific ones and moving toward very general ones, until
22900 one of them reports it is able to acheive the desired goal.
23000
23100 ⊗4RESOLVE DECISION⊗*. Again, all demons must be inhibited. After some
23200 preliminary searching and very trivial theorem-proving fail, this
23300 BEING resorts to asking the user about how to resolve a given
23400 decision.
23500
23600 ⊗4ASK USER ABOUT⊗*. We determine the argument instantiations of the
23700 little piece of code we're worrying about, determine the type of
23800 decision to be made, and apply the specific knowledge variable for
23900 x-ing that type of decision. Here, we get x by examing who called
24000 this BEING and why. To write a specialized version of ask-user-about,
24100 we just write a standard print, read, and assign function, with the
24200 details left unspecified until the sample session is read in.
24300
24400 ⊗4BETTER⊗*. This function is used to compare two BEINGs, and decide
24500 which of them should gain control. It evaluates their WHEN parts,
24600 and if they tie it evaluates their complexity vectors. Note that
24700 "eval" here is not trivial: each dimension of the complexity vector
24800 of a BEING can be a little program which examines itself, other
24900 BEINGs, and the state of the world before deciding on a numerical
25000 answer to return.
25100
25200 ⊗5Handling of User Interrupts⊗*. There are several functions and
25300 BEINGs involved in this process. Initially, the user describes how
25400 often the system is to give him the opportunity to interrupt and
25500 query it. At each of these times, the HANDLE USER INTERRUPT function
25600 asks the user if he wants to interrupt; if so, PROCESS USER INTERRUPT
25700 is called to do the job. In addition to asking for pieces of any
25800 BEING, the user may request limited simulated execution of various
25900 pieces, and may order the current BEING to FAIL.
26000
26100
26200
26300 ⊗5D. High-level, problem-independent knowledge: how to write
26400 programs⊗*
26500
26600 ⊗4SERVE⊗*. Obtain information until some of it is "executable," then
26700 carry it out. The forgetful-user demon is activated. Without this
26800 top-level purpose, PUP6 sits contentedly, never wanting to accept a
26900 new task.
27000
27100 ⊗4WRITE PROGRAM⊗*. The user must be made aware that PUP6 is about to
27200 write a program, what kind of program it is, what its name is (this
27300 will force a get-name BEING call), and that its type has been
27400 examined (this will cause a study-type BEING call). Upon exit, he
27500 must be told that PUP6 has completed the task, and what its new
27600 capabilities are. To wite a program, one enters a loop, broken only
27700 when several completion conditions are all true simultaneously: the
27800 top-level task is now a BEING, there are no undefined sections of
27900 code, there are no warnings left about the code, there is no
28000 executable information anywhere, there is no new but unprocessed
28100 information, there are no decisions still pending (except those
28200 requiring "everything else" to be complete; e.g., the adaptation of
28300 output formats using a sample session). If we do break out of the
28400 loop, we must update the list of programs written, the list of what
28500 PUP6 can now do, of what the user may do, we find the set of support
28600 of the top-level task and create a new file with the relevant
28700 functions and BEINGs (which automatically does global initializations
28800 and then enters the top-level task instead of SERVE). In general,
28900 of course, we won't break out, so we activate all the current demons
29000 and go on. All the body of the loop is is one CHOOSE FROM, between
29100 six alternatives: obtaining some usable info, using some usable info,
29200 filling in some function call which is currently undefined,
29300 clarifying some little piece of code known to be improbable for some
29400 specific reason, adapting some known function to conform to some
29500 specific new requirements, fixing some piece of code which doesn't
29600 work the way it claims to work. The last two of these are simply
29700 program modification and debugging, respectively! Failure of one of
29800 these six BEINGs simply causes CHOOSEFROM to try another one; failure
29900 of a demon causes the whole WRITE PROGRAM BEING to fail. During its
30000 reign, the program-writing demons, deferral-demon, and
30100 reinvestigation-demon are all activated. Its complexity vector is
30200 dependent upon that of the BEING closest to the task it must perform.
30300
30400 ⊗4OBTAIN USABLE INFORMATION⊗*. The WHEN part informs us that this is
30500 always undesirable (-10) but is OK (111) if there exists new but not
30600 yet usable information. All we do here is CHOOSE FROM the following
30700 four alternatives: translate something, get brand new information,
30800 analyze the implications of existing information, extract a small
30900 relevant subset of the existing information to concentrate on.
31000
31100 ⊗4USE INFORMATION⊗*. This demands that some executable information
31200 exist. We select one such piece, and try to execute it. If we fail,
31300 its worth is decreased; if we succeed, it is removed from the
31400 executable info assert list.
31500
31600 ⊗4FILL IN UNDEFINED SECTION⊗*. There must be some undefined section.
31700 If so (80) we don't want any hi-priority (≤20) coding warnings still
31800 around (-150), and we do like there to be something both undefined
31900 and known to be encodable (96). We fix a choice of what to encode,
32000 and try to acheive its encoding. If we fail, we update the difficulty
32100 of that choice, and may assert that we want some specific new
32200 information to relieve the problem. In addition to ENCODE, this
32300 BEING also may call MAKE ENCODABLE and STUDY TYPE.
32400
32500 ⊗4CLARIFY IMPROBABLE SITUATION⊗*. This BEING demands that something
32600 of mediocre priority (≤500) exist on the coding warning assert-list.
32700 It likes (51) this, and dislikes (-41) anything on the undefined
32800 section list, or anything (-200) on the encodable section list. As
32900 always, a sentence is provided to justify each of these little
33000 beliefs. We choose the warning with the highest priority (lowest
33100 numerical weight) on the coding warning list, note that that is what
33200 PUP6 is working on now, and do a match to decide what type of warning
33300 it is. (i) Replace x by y in z. Here these may be nonspecific; z may
33400 be "in all code recently generated". The nature of y may cause us to
33500 include new warnings; y may mention a new data structure. (ii) x in y
33600 is undefined; probably z since r. This may cause us to add to the
33700 global initialization list. It will probably cause us to ask the user
33800 what the answer is. (iii) x is a data structure but we don't know
33900 much about it. We try to find out its structure, how to initialize,
34000 access, insert, delete, update it. A variant of this warning is: (iv)
34100 We find no x's associated with data structure y. Here x can point to
34200 insertions, deletions, initializations, etc. If we can't justify the
34300 lack, we try to defer the decision. Failing that, we ask the user.
34400 (v) Command: if x then y. This is a programmed demon; when situation
34500 x is true, we must do y. (vi) Delete all mention of x. This is like a
34600 replace, but we go through the assert lists with an eye toward
34700 deleting unnecessary worries. (vii) Infinite loop in x from y to z.
34800 If we can't justify this, we insert a test to break out of the loop.
34900 Justification might be that this loop is in the top-level function of
35000 the system, where we never wnat to break out. (viii)
35100 Incomprehensible: x, y. (there is a "bug" in x manifesting itself as
35200 y) Never needed to write CF, so not implemented. Should call FIX
35300 INCORRECT PIECE (which is also not in yet) or ask the user for
35400 assistance.
35500
35600 ⊗4GET NEW INFORMATION⊗*. Naturally, it is not thrilled if (-68) there
35700 exists some new but unexamined information, and it is happy (50) if
35800 there is none. The prerequisites ensure that the user is aware of
35900 what PUP6 wants, and if the theorem prover can't deliver it, PUP6
36000 asks the user for some. If PUP6 asks for something general ("any
36100 task") it is because it knows precisely that this is what it wants,
36200 not out of ignorance! During execution, the specificity check demon
36300 is active; he ensures that it is indeed phrased as specifically as
36400 possible; if not, MAKE SPECIFIC will be called. This is a very
36500 uncomplicated BEING, and a very unpopular one to use since we should
36600 squeeze every drop of meaning out of what we have before asking for
36700 more information.
36800
36900 ⊗4EXTRACT RELEVANT SUBSET⊗*. This likes (70) there to be a great deal
37000 (≥50 pieces) of new information, and dislikes (-80) it if there are
37100 under a dozen such tokens. It finds and evaluates knowledge variables
37200 to constrain what should be looked at currently. This is never called
37300 in the dialog, though it was in the protocol.
37400
37500 ⊗4ANALYZE IMPLICATIONS⊗*. The WHEN part is unhappy (-60) if there is
37600 usable information already, since this BEING is fairly costly. It
37700 also examines the size of the new info list to see just how long this
37800 search will be. The BEING locates the code that will be affected,
37900 predicts the affects, and sees how desirable this is. This BEING is
38000 also never needed to write CF.
38100
38200 ⊗4MAKE ENCODABLE⊗*. If all else fails, this BEING tries replacing a
38300 function by one of its alternatives or, as a last resort, by one of
38400 its generalizations. This last resort is never needed to write CF.
38500
38600 ⊗4ENCODE⊗*. Despite its key name, this BEING is mostly a bookkeeper!
38700 It runs around the assert lists, gathering up enough information to
38800 encode a specified little piece of code. A program schema is built
38900 up, instantiated as much as possible, printed out for the user to
39000 refer to, and passed to a highly optimized recursive auxilliary
39100 function, GETCODE. Some worrying about the arguments is
39200 done,including what they might be instantiated as. We inform the user
39300 of the code BEING written, and a prerequisite causes the function to
39400 become made into a BEING.
39500
39600 ⊗4STUDY TYPE⊗*. This BEING accepts a BEING call, looks at that
39700 BEING's SPECIALIZATIONS part, translates each entry into a decision,
39800 and dumps these onto the undeferred decision list. A deferral demon
39900 promptly attends to them.
40000
40100 ⊗4GET DATA STRUCTURE⊗*. We call select-structure type, and use its
40200 advice to point to the proper knowledge variable. We eval it to set
40300 up the various parts of the data structure. In unclear cases, we may
40400 ask the user whether the argument really is a data structure. We
40500 ensure that this object is a BEING (else we make it one,) and we add
40600 warnings to the effect that there might not be any insertions or
40700 deletions; we'll worry about that much later, by another BEING. We
40800 know that the elements of a data structure are themselves usually
40900 data structures, so one the prerequisites says that we must try to
41000 make the arguments into data structures as well.
41100
41200 ⊗4GETCODE⊗*. This is a long, special-case, recursive function. It
41300 goes through a piece of code with two jobs: to build up a list of
41400 arguments to this function, and to get names for specific instances
41500 of general BEINGs. Part of the kludgy character is due to the fact
41600 that there are non-BEINGs in the code: primitive forms (structure,
41700 tuple, vector, class, comment, atoms), primitive functions (read,
41800 null, and, ...), primitive variables (T, NIL, 4, ...). By keeping
41900 closely to the theoretical ideals implicit in the ideas, this
42000 function would probably become trivial.
42100
42200 ⊗4GET NAME⊗*. First we study plausible names for the new specialized
42300 BEING. We make the user aware of them. We examine the BEING name
42400 carefully, to see if it was just mentioned in the same piece of code
42500 (probably then the same name), whether it's ever been mentioned and
42600 specialized, and so on. Sometimes we end up deciding not to get a new
42700 name, sometimes we pick one and just tell the user, sometimes we
42800 recommend some old specialized name. In 90% of the cases, though, we
42900 simply say "give me a name for ...". A new-name counter is maintained
43000 to ensure unique names, and this number is appended onto the end of
43100 what the user types, unless it's an already-existing BEING name. The
43200 user my type ] (and usually does), indicating that PUP6 is to choose.
43300 PUP6 always informs the user what the name is at the end.
43400
43500 ⊗4MAKE NEW BEING⊗*. This BEING has the awesome responsibility of
43600 giving life to new BEINGs. As is the case with neurons, soldiers, and
43700 all (good) BEINGs, no one BEING really does much when you look at it
43800 in isolation. Thus this one already gets the name of the BEING, the
43900 name of its parent (its least general generalization), how the
44000 metacodes of these two differ, the argument list, the reason for this
44100 specialized BEING to exist, what extra qualities it possesses or
44200 lacks wrt its parent, etc. It can figure out who it affects by
44300 examing its various parts, and it bases the complexity vector on that
44400 of the parent and on the changes in this new BEING. Thus it basically
44500 does gets, evals, and puts. It updates various assert lists, and
44600 semi- compiles the new BEING. One bit of knowledge says that the
44700 explicit args check may be significantly different, and the user may
44800 be queried.
44900
45000
45100
45200 ⊗5E. Low-level problem-independent knowledge: bits of
45300 programs⊗*
45400
45500 ⊗4PROPOSE PLAUSIBLE NAMES⊗*. This BEING is called by GetName, and
45600 should incorporate a good model of user psychology of name-choosing.
45700 Currently, it applies Initials, Mainwords, Firstfew, and compositions
45800 of these, to the task description sentence.
45900
46000 ⊗4SEMI COMPILE⊗*. Basically, BEINGs only lend themselves to
46100 interpreting; to help speed up this process, this BEING can take
46200 advantage of regularities and simplicities in another BEING's parts,
46300 and turn out a fast function to do an equivalent job. For example,
46400 if a BEING doesn't enable any new demons, we needn't push the demon
46500 stack at the beginning and pop it at the end.
46600
46700 ⊗4SELECT STRUCTURE TYPE⊗*. This is a true bit of programming
46800 knowledge: if the structure is composed of several weakly-interacting
46900 pieces tied together through association with one atom, then the data
47000 structure should be a list of these atoms, with the associated
47100 structures BEING stored on the property lists of the atoms. If there
47200 are only a couple pieces, or they interact strongly, we should use a
47300 nested list structure instead.
47400
47500 ⊗4ELEMENT⊗*. All we know about an element is that it is a member of a
47600 data structure, and that we should not be ashamed to ask the user to
47700 clarify himself if he mentions this. The ConceptFormation BEING --
47800 not the ELEMENT BEING -- should note that future references to
47900 ELEMENT actually refer to a scene, an instance of a concept, and that
48000 references to class refer to the model of a concept, a set of scenes.
48100 This may change as new data structures come into existence.
48200
48300 ⊗4MODIFY STRUCTURE⊗*. Generally, we will be given a typical element,
48400 and must identify the structures it belongs to, and modify them. The
48500 precise details indicate that some subset of CONDITIONAL INSERTION,
48600 CONDITIONAL DELETION, and COMPLEX ALTERATION will be involved.
48700
48800 ⊗4GET HOLD OF⊗* by guessing and checking. We must discover whether an
48900 algorithm already exists which can get arg1. If not, we try to find
49000 one which can get arg1 and some other effects. We structure the
49100 function as some of COMPUTE, GENERATE&TEST, and SEARCH. Finally, we
49200 must decide now on the type of error recovery desired: none,
49300 backtrack, or correct-and-try-to-proceed.
49400
49500 ⊗4TAKE HOLD OF⊗* trivially. We examine the flow of control through
49600 the preceding code, to decide whether arg1 has ever been SET. If so,
49700 we must ask the user whether or not to read in a new value. If no
49800 read is to be done, then this BEING reduces to a simple assignment,
49900 or perhaps to nothing at all. Otherwise, we read in the object, and
50000 assign its various subparts to SOME PART OF it.
50100
50200 ⊗4IS OF TYPE⊗*. This BEING is a predicate which is too low-level and
50300 general to do much. Basically, it helps formulate a question to the
50400 user, who must explain to PUP6 how to construct any predicate it
50500 comes across, usually just by translating a sentence the user types
50600 in.
50700
50800 ⊗4COMPLEX ALTERATION⊗*. This BEING is always replaced by some
50900 initializing assignments, followed by calls on MODIFY STRUCTURE for
51000 some subparts of arg1. A bit of advice is that if arg1 is
51100 unstructured, try to get it structured. If this fails, maybe what is
51200 really wanted is to modify the structure of which arg1 is a member.
51300
51400 ⊗4CONDITIONAL DELETION⊗*. As above, we look at the structuring of
51500 various arguments to decide what is really supposed to be deleted
51600 from where. We check with the user, remind him of various bindings
51700 relevant to the current call, and ask him to describe (1) under what
51800 conditions we do the deletion, (2) what exactly do we delete. These
51900 are translated, analyzed in the context of deletion, and help
52000 determine the deletion piece of code.
52100
52200 ⊗4CONDITIONAL INSERTION⊗*. This is similar to the preceding BEING.
52300 Here we also worry about whether the entity to be inserted has ever
52400 been bound. If not, we must see that it is! Often, this binding will
52500 be related to the Initialize piece of the DataStructure part of the
52600 BEING representing the structure we're inserting into. Since some
52700 data structures have several similar but distinctly-named elements
52800 existing simultaneously, we have lots of little worries. After we
52900 have planned out the code, we check with the coding warning list and
53000 add to it; e.g., any undefined initial value of a variable in a piece
53100 of code we stuck in here, will also be uninitialized here. If we
53200 later decide never to worry about the first initialization, we must
53300 not forget this one! This is a frequent source of bugs, I think.
53400
53500 ⊗4GENERATE&TEST⊗* and ⊗4COMPUTE⊗* are not needed or implemented.
53600
53700 ⊗4SEARCH⊗*. Currently, a primitive type of searching suffices. We
53800 simply execute the iteration FOREACH X IN XSET DO (TEST X). If we're
53900 unsure, we check that XSET has the form SET OF (PLURAL X). The
54000 specializations tell us to worry about termination conditions. I
54100 suppose this BEING could also be used for generate-and-test.
54200
54300 ⊗4FOREACH⊗*. Not surprisingly, this is the iteration BEING. It is an
54400 NLAMBDA function, so its arguments aren't surrounded by quotes. There
54500 are various minor decisions to make, which simplify any specialized
54600 version: there may or may not be an "until" condition; we must get it
54700 and also decide what to bind the iteration variable to and what value
54800 to return, both in case this condition is met, and in case we exhaust
54900 the space. Often, we will decide to leave some of these unspecified,
55000 put notes about them on the coding warning list, and not worry about
55100 them for a long time.
55200
55300 ⊗4TEST⊗*. This BEING is a predicate which is oriented toward a
55400 threshold of acceptability, whereas IS_OF_TYPE is oriented toward
55500 separating cases. It either has the flavor of comparing, or of
55600 competition. We must also decide whether the result is nominal,
55700 ordinal, or of ratio caliber. These latter two never crop up, which
55800 is why we assume the test is a predicate always.
55900
56000 ⊗4SOME PART OF⊗*. We either get this simple destructive function by
56100 examples (like Shaw's program) or by translating a user-supplied
56200 English sentence describing what it does. Typically, some
56300 combinations of CAR and CDR, occasionally uses some constants (1, T,
56400 NIL) or constructive primitives (CONS, NCONC).
56500
56600 ⊗4COMPARE⊗*. We must worry about whether the result is nominal,
56700 ordinal, or ratio. We also decide whether the comparison is a JOINING
56800 FUNCTION applied to its arguments, a constant function (executed only
56900 for side effects), or a JOINING FUNCTION applied to the results of
57000 COMPAREing corresponding subparts of the arguments. This last case is
57100 both frequent and complicated. If neither argument is structured,
57200 this is impossible. If both are highly structured, their structures
57300 must have a nontrivial amount of correspondence in order to succeed.
57400 If only one argument is structured, this strongly suggest that the
57500 other one should be similarly structured. Often we go ahead and
57600 structure it without asking the user (inference by analogy.)
57700
57800 ⊗4BETTER⊗*. We've discussed this earlier. Here we should note that
57900 when called on to write a new, specialized BETTER function, it
58000 chooses either a simple function (T, NIL) to allow an optimization
58100 (replace by CONS, replace by NCONC1), or it chooses a complex
58200 comparing function (e.g., alphorder, write-program itself!).
58300
58400 ⊗4JOINING FUNCTION⊗*. This is a way of condensing results. It is
58500 typically a known function, such as AND, OR, PLUS, MAXIMUM, PROGN...
58600 which is determined by (i) the character of the result (numerical,
58700 T/F) and (ii) whether we are in a DO UNTIL loop, a DO REPEATEDLY
58800 loop, or neither of these loops.
58900
59000
59100 ⊗5F. Programming Knowledge stored in variables⊗*
59200
59300 To resolve an ADAPTATION decision: Ask for sample dialog,
59400 symbolically run current code, modifying I/O formats.
59500
59600 To defer any decision whose AFFECTS part is known: It may
59700 translate as some detail of x; in that case, wait until some code for
59800 x already exists. The affects part may translate as The x algorithm;
59900 in this case we worry as soon as PUP6 begins DO-ing any coding at all
60000 of x.
60100
60200 To defer an ALTERNATIVES decision: Examine the coding tasks
60300 in all cases, and try to find some common head and/or some common
60400 tail. If there is any, try to defer the decision until after writing
60500 code for this common subtask sequence. If one alternative exactly
60600 matches the common sequence, we can temporarily assert that the
60700 decision has been made to do this simplest BEING.
60800
60900 To terminate an AND loop: The conjunction is usually between
61000 highly similar objects. Related to how to parse a sentence containing
61100 ANDs.
61200
61300 To resolve a BOOLEAN decision: Ask the user to respond Yes or
61400 No. The decision has special Yes, No, and Both parts; in each case,
61500 two of them will be evalled.
61600
61700 To resolve a DEFINITION decision: Locate the defined object,
61800 reaffirm that it is undefined, ask the user to define it, check
61900 whether it is a predicate, data structure, etc., and tell the user
62000 about such constraints.
62100
62200 To resolve a DICHOTOMY decision: Each such decision contains
62300 a little program which now is evaluated. If it succeeds, its value
62400 answers the decision; if it fails, we have to ask the user to choose
62500 alternative 1 or 2. The choice points to a little program to run, and
62600 its value is the desired resolution.
62700
62800 To resolve a PREDICATE decision: Fix the context of the
62900 predicate call, try to relate this predicate to some known tests,
63000 and, if this fails, ask the user. His response is specially processed
63100 in the context of a predicate fixation.
63200
63300 To set up a data structure stored as a LIST STRUCTURE: The
63400 initialization is a simple SETQ to an as-yet-undefined value. Access
63500 is simply by mentioning the name. Insertion is by Merge:in or Merge.
63600 Deletion is by Pullout or Setdifference.
63700
63800 To set up a data structure stored as a PROPERTY LIST
63900 STRUCTURE: Delineate the pieces to be associated with each atom, and
64000 name them. Accession is via GETP. Insertion s via a GETP, then a
64100 MERGE:IN, then a PUT. Deletion is via a GETP, then a PULLOUT, then a
64200 PUT. Initialization is a PUT of a SETQ to an as-yet undefined value.
64300 Each named substructure must itself become a data structure,
64400 typically a simple list structure.
64500
64600 To resolve a SOMEOF decision: The user must be sufficiently
64700 informed about the choices. He types back a non-null, ordered subset
64800 of those choices. We examine them to see if they should be enclosed
64900 in a PROGN or in a REPEATEDLY, and whether we would like to see
65000 something structured in a certain way.
65100
65200 To resolve a SUBSETOF decision: Similar to above, but no
65300 fancy investigation, just string the choices together in a PROGN.
65400
65500 To defer any decision whose WHEN part is provided: We
65600 transform "before deciding firmly how to ←x ←←y" into something like
65700 "member (cons detail-of-$x-ing $y) doing-pup-list". We also translate
65800 Before any ←←x routine is finalized, After ←←x routines are
65900 finalized, and similar phrases. These must always evaluate T or NIL.
66000
66100
66200 ⊗5G. Demons and functions of interest⊗*
66300
66400 LIST:JOIN, MERGE:IN, PULLOUT. These rather standard functions are
66500 given a tiny bit of advice: if their "element" is more like a sublist
66600 than an element of their "list", then they assume that what was meant
66700 was append or setdifference, not cons or merge or remove.
66800
66900 LONG NAME DEMON. If any name becomes unwieldy, he notices it and asks
67000 the user for a nickname.
67100
67200 PERMIT DETAILED DECISION. Implicit near the beginning of each
67300 decision, PUP6 called this function. It asks the user if he wants
67400 more details, and if so it gets them and prints them out.
67500
67600 STRUCTURE INDUCING DEMON. If the object is a BEING already, special
67700 considerations apply. If the object and all arg values are
67800 ill-defined, we decide not to do any structuring. Otherwise, we
67900 investigate the effects of structuring the object into the pieces
68000 specified in the args. If there is no problem, and if the user
68100 consents, we tack the appropriate Replace messages onto the coding
68200 warning list (with a high priority). We activate Long Name Demon.
68300
68400 ⊗5Natural Language Translation⊗*. We have already discussed the
68500 TRANSLATE BEING and the basic way of handling natural language input.
68600 Several BEINGs exist primarily for this purpose; RECOGNIZE:ARGS,
68700 RECOGNIZE:C*R, RECOGNIZE:CONDITIONAL, RECOGNIZE:CONJUNCTION,
68800 RECOGNIZE:EQUALITY, RECOGNIZE:FUNCTION:RETURNS, RECOGNIZE:INCLUSION,
68900 RECOGNIZE:LITERALS, RECOGNIZE:NUMBER, RECOGNIZE:SET:RELATIONS,
69000 RECOGNIZE:SOME:MEMBER, ADD:DEFINITION, ADJECTIVE:HANDLER, REPEATEDLY.
69100 Also, there are several functions related to translation: e.g.,
69200 UNGERUNDIFY, PLURAL, OPPOSITE. All these are straight-forward, and
69300 their task is obvious from their name.
00100
00200 .SELECT 1
00300
00400 ⊗5(ii) The increment of knowledge necessary to write GI⊗*
00500
00600 ⊗4APPLY RULE⊗*. This BEING accepts two arguments, which must be
00700 interpretable as a string and a rule, respectively. The string is
00800 compared against the left side of the rule and, if applicable, the
00900 change indicated by the right side is executed on the string. It is
01000 immediately encodable if the user wishes all possible applications of
01100 the rule to the string; else a more specialized version must be
01200 synthesized.
01300
01400 ⊗4CONSTRAIN⊗*. Try first to decide if it is meaningful for the given
01500 structure to be constrained, and, if so, how. Next see whether any
01600 other BEINGs can help. Finally, ask the user to specify any
01700 constraints he can think of. For example, can a list structure grow
01800 indefinitely, or can we use some fancy programming trick because the
01900 size stays small?
02000
02100 ⊗4GRAMMATICAL INFERENCE⊗*. Needless to say, this is a high-level,
02200 domain-specific BEING! It must infer grammars from exemplary strings;
02300 it knows this to be the only reasonable g.i. paradigm. If it fails,
02400 some genralizations are inductive inference, enumeration,
02500 problem-solving; some alternatives are concept formation and
02600 simulated evolution. There are many minor decisions, similar to the
02700 concept formation decisions (e.g., examine a sample GI-user dialogue
02800 to finalize the printing formats). The major decision is dichotomous:
02900 whether our new specialized BEING should retain the ability to input
03000 the type of grammar and vary its strategy accordingly, or whether
03100 only one fixed type of grammar (e.g., context-free) will always be
03200 used, and may be "built in" to the code. The result of this decision
03300 is to pass control to one or the other of the following two BEINGS.
03400
03500 ⊗4INFER MULTI-CLASS GRAMMARS⊗*. We read in the type of grammar, and
03600 then call the appropriate specialist for that type. Thus we have a
03700 big switch here.
03800
03900 ⊗4INFER FIXED-CLASS GRAMMARS⊗*. This routine determines at
04000 program-synthesis time what the class is going to be, and thus will
04100 be a fixed call to one of the following four BEINGS. The speedups
04200 will arise from using the constraints on the rules.
04300
04400 ⊗4INFER PHRASE-STRUCTURE GRAMMARS⊗*. There are no rule constraints in
04500 a type 0 grammar; each half of the rule is viewed as an arbitrary
04600 list of letters. When a TEST is indicated, the
04700 fringe-of-conciousness demon must report it is thinking of PARSE. The
04800 left and right sides of a rule will be destructive operations on the
04900 data representation of a rule.
05000
05100 ⊗4INFER CONTEXT-SENSITIVE GRAMMARS⊗*. We shall only report on the
05200 differences between the INFER... BEINGS. This one knows that the
05300 right side of the rule must be at least as long as the left side.
05400 This will be used as a pruning heuristic when proposing plausible
05500 rules.
05600
05700 ⊗4INFER CONTEXT-FREE GRAMMARS⊗*. Grammars of type 2 have as their
05800 left side a single nonterminal. Further simplifications can occur by
05900 only working toward a Griebach Normal Form or Chomsky Normal Form
06000 grammar, although from the standpoint of inference energy these are
06100 harder.
06200
06300 ⊗4INFER REGULAR GRAMMARS⊗*. A type three grammar has a unit on the
06400 left and a pair of them for a right side (one terminal, one
06500 nonterminal). This is a very powerful pruning heuristic.
06600
06700 ⊗4MAJOR MODIFY STRUCTURE⊗*. The old, simple "insert,delete,alter"
06800 paradigm of modification was no longer sufficient. This BEING heads a
06900 whole complex of modify BEINGS, including the old one as the
07000 low-level workhorse primitive. Here is a sketch of the organization:
07100 .BEGIN NOFILL
07200 MAJOR-MODIFY-STRUCTURE
07300 / | \
07400 MODIFY-UNTIL | MODIFY-SOME
07500 \ | /
07600 THE-ORIGINAL-MODIFY-STRUCTURE-BEING
07700 .END
07800 So this top-level modifier calls some subset of the three lower
07900 modification BEINGS. Later, we had to add a fourth alternative,
08000 EXAMINE DATA STRUCTURE, to aid in writing the INF program.
08100
08200 ⊗4MODIFY SOME⊗*. We determine a set S and a predicate P at synthesis
08300 time. At run time, we map through S and apply P; all elements
08400 responding positively are modified (using the original MODIFY
08500 STRUCTURE.) The decisions about P and S are easily deferrable.
08600
08700 ⊗4MODIFY UNTIL⊗*. This BEING is simply an order to compose
08800 REPEATEDLY π. MODIFY_STRUCTURE. The former BEING bears the brunt of
08900 the responsibility of the interface.
09000
09100 ⊗4PARSE⊗*. Attempt to parse a string from the current set of rules,
09200 by reversing each rule and composing their applications. We decide
09300 at synthesis time whehter or not to maintain a parse tree, whether or
09400 not to maintain a list of the rules used during the parse, whether we
09500 stop parsing at any legal string or only at "S," whether we parse
09600 forwards or backwards or both, how deeply in each direction (this is
09700 always deferred until much later), whether one direction after
09800 another or (simulated) simultaneously. We look for the DataStructure
09900 part of the BEING representing a rule, and ask him about constraints
10000 on the rule, and about how to destructively recover the left and
10100 right sides separately.
10200
10300 ⊗4PARSE BACKWARD⊗*. This BEING is given two strings and a set of
10400 rules; the task is to apply anti-rules to the target string until it
10500 becomes the initial string. This is typically done breadth-first.
10600 Special modifications must be made if there is a parse tree to
10700 maintain, if a set of rules used must be maintained, etc. The basic
10800 body is a nest of FOREACH calls (∀rule, ∀way of applying the rule to
10900 the string, recurse). To avoid infinite recursion, we must supply a
11000 third argument: the depth to which we compose these anti-rules before
11100 we give up. When calling itself recursively, this level will be
11200 decremented.
11300
11400 ⊗4PARSE FORWARD⊗*. This BEING is analogous to the previous one, using
11500 rules themselves instead of anti-rules. Notice how clearly the place
11600 to insert searching heuristics is marked out for us (although none
11700 are present.)
11800
11900 ⊗4STRING⊗*. This is a structure whose parts are a name, a list of
12000 letters, a set of comments. It is advisable to use list structures
12100 rather than property lists to represent strings, since they will
12200 probably only be accessed by one of their three parts. In the GI
12300 program, we don't use STRING itself, but rather we mention
12400 UNCOMMENTED STRING, which causes this BEING to create a new
12500 specialized version of itself, sans the third, comments part.
00100
00200 ⊗5(iii) The increment of knowledge necessary to write INF⊗*
00300
00400 ⊗4EXAMINE STRUCTURE⊗*. This is another one of the parts of a major
00500 MODIFY structure. If the fringe of conciousness demon can't come up
00600 with a reasonable matching function, one is selected now. The basic
00700 body says to do PATTERN MATCH, using this match function, and convey
00800 the results to the caller (who may be the user.) The inputs are thus
00900 a pattern, a data structure name, and possibly a hint to a match
01000 function.
01100
01200 ⊗4PATTERN MATCH⊗*. This existed as a system function earlier, but for
01300 INF it is necessary to write a tailored pattern-matcher. In
01400 particular, INF demands that we strip away the common head and tail
01500 from both pattern and expression, and then compose the two remaining
01600 pieces into the left and right sides of a new plausible rule, and
01700 then check that this conforms with the constraints on rules. This is
01800 certainly different from the type of match needed by CF! Notice that
01900 we had to add the "eliminate common head/tail" functions to our list
02000 of system primitive functions.